
[BugFix] [Enhancement] Fix nullptr and support Iceberg null padding #49212

Merged
merged 1 commit into StarRocks:main on Aug 7, 2024

Conversation

@Samrose-Ahmed (Contributor) commented Jul 31, 2024

PR #48151 introduced a regression: what would previously return an Error now caused a nullptr dereference and crashed the entire CN. This change removes the crash and, for Iceberg, adds support for padding evolved fields with null values, as required by the Iceberg spec.

Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

@@ -274,6 +275,14 @@ Status GroupReader::_create_column_reader(const GroupReaderParam::Column& column)
        RETURN_IF_ERROR(ColumnReader::create(_column_reader_opts, schema_node, column.slot_type(),
                                             column.t_iceberg_schema_field, &column_reader));
    }
    if (column_reader == nullptr) {
        if (column.t_iceberg_schema_field == nullptr) {
            return Status::InternalError("Invalid file: No valid column reader.");
Samrose-Ahmed (Contributor Author)

I combined the bugfix and the enhancement. Let me know if you have any comments on the null-padding behavior; otherwise I can split the PR and, at a minimum, return an error status so we don't segfault at e.g. column_reader->set_need_parse_levels.

Contributor

@Samrose-Ahmed maybe it would be better to split the bugfix and the enhancement. As far as I know, the Parquet reader only materializes a column that has at least one subfield; a padded column is padded at the scanner level. cc @Smith-Cruise

Samrose-Ahmed (Contributor Author)

Yes, it's for struct evolution in Iceberg. For example, if you had event: struct<action: string> and evolved the schema to event: struct<action: string, id: string>, then according to the Iceberg spec, selecting event.id must pad the old files that don't have the event.id column with null.
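
For illustration, here is a minimal self-contained C++ sketch of that padding rule; the types and names are hypothetical stand-ins, not StarRocks code:

#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical row shape: a struct value is a map from subfield name to an
// optional string; std::nullopt models SQL NULL.
using StructRow = std::map<std::string, std::optional<std::string>>;

// Pad every projected subfield that is absent from an old file's rows.
std::vector<StructRow> pad_missing_subfields(std::vector<StructRow> file_rows,
                                             const std::vector<std::string>& projected_subfields) {
    for (auto& row : file_rows) {
        for (const auto& field : projected_subfields) {
            // e.g. "id" after evolving struct<action> to struct<action, id>:
            // absent from the old file, so it is read as NULL. emplace is a
            // no-op when the subfield already exists in the row.
            row.emplace(field, std::nullopt);
        }
    }
    return file_rows;
}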

Contributor

Yes. For this case, where we want to query event.id and the older files don't contain this subfield, I actually think we have covered it already: the column is treated as one that does not need to be materialized in the reader.

Samrose-Ahmed (Contributor Author)

I debugged it, and it crashes on one of our Iceberg tables. I added a unit test to replicate the crash.

@Smith-Cruise (Contributor)

Can you put the crash log here?

@Samrose-Ahmed (Contributor Author)

> Can you put the crash log here?

The crash is at this line when column_reader == nullptr:

column_reader->set_need_parse_levels(true);
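// column_reader is nullptr at this point, so this call dereferences a null pointer -> SIGSEGV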

Log:

main DEBUG (build 20cb672)
query_id:cf9af543-4f0e-11ef-ad60-3a56daa8801c, fragment_instance:cf9af543-4f0e-11ef-ad60-3a56daa8801e
Hive file path: /data/t8D0YA/ts_hour=2024-07-31-07/b2fd9c85-2c13-4545-b819-7fe66ca9566e.parquet, partition id: -1, length: 11181, offset: 4
tracker:process consumption: 477005088
tracker:query_pool consumption: 25234016
tracker:query_pool/connector_scan consumption: 117899264
tracker:load consumption: 0
tracker:metadata consumption: 0
tracker:tablet_metadata consumption: 0
tracker:rowset_metadata consumption: 0
tracker:segment_metadata consumption: 0
tracker:column_metadata consumption: 0
tracker:tablet_schema consumption: 0
tracker:segment_zonemap consumption: 0
tracker:short_key_index consumption: 0
tracker:column_zonemap_index consumption: 0
tracker:ordinal_index consumption: 0
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 0
tracker:jit_cache consumption: 0
tracker:update consumption: 0
tracker:chunk_allocator consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 3184743
tracker:replication consumption: 0
*** Aborted at 1722411050 (unix time) try "date -d @1722411050" if you are using GNU date ***
I20240731 00:30:50.066682 126608369256128 logconfig.cpp:131] Start to release memory of cache
I20240731 00:30:50.066693 126608369256128 logconfig.cpp:133] Release memory of cache success
I20240731 00:30:50.066959 126608369256128 logconfig.cpp:147] je_mallctl execute purge success
I20240731 00:30:50.069147 126608369256128 logconfig.cpp:155] je_mallctl execute dontdump success
PC: @         0x1163d5cd starrocks::parquet::GroupReader::_create_column_reader(starrocks::parquet::GroupReaderParam::Column const&)
*** SIGSEGV (@0x0) received by PID 47216 (TID 0x73264ee006c0) from PID 0; stack trace: ***
    @     0x732a5e4a1ec3 (/usr/lib/x86_64-linux-gnu/libc.so.6+0xa1ec2)
    @         0x149dfaf9 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
    @     0x732a5e445320 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x4531f)
    @         0x1163d5cd starrocks::parquet::GroupReader::_create_column_reader(starrocks::parquet::GroupReaderParam::Column const&)
    @         0x1163d287 starrocks::parquet::GroupReader::_init_column_readers()
    @         0x1163ac09 starrocks::parquet::GroupReader::init()
    @         0x115c3610 starrocks::parquet::FileReader::_init_group_readers()
    @         0x115be4a4 starrocks::parquet::FileReader::init(starrocks::HdfsScannerContext*)
    @         0x1131b55a starrocks::HdfsParquetScanner::do_open(starrocks::RuntimeState*)
    @         0x112f5137 starrocks::HdfsScanner::open(starrocks::RuntimeState*)
    @         0x11244574 starrocks::connector::HiveDataSource::_init_scanner(starrocks::RuntimeState*)
    @         0x1123d529 starrocks::connector::HiveDataSource::open(starrocks::RuntimeState*)
    @          0xc90edb0 starrocks::pipeline::ConnectorChunkSource::_open_data_source(starrocks::RuntimeState*, bool*)
    @          0xc90f18f starrocks::pipeline::ConnectorChunkSource::_read_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk>*)
    @          0xd1c0d97 starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking(starrocks::RuntimeState*, unsigned long, starrocks::workgroup::WorkGroup const*)
    @          0xc8e1c18 auto starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}::operator()<starrocks::workgroup::YieldContext>(starrocks::workgroup::YieldContext&) const
    @          0xc8e4dd5 void std::__invoke_impl<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}&, starrocks::workgroup::YieldContext&>(std::__invoke_other, starrocks::pipeline::ScanOperator::_trigger_next_scan(starro.
    @          0xc8e4cae std::enable_if<is_invocable_r_v<void, starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}&, starrocks::workgroup::YieldContext&>, void>::type std::__invoke_r<void, starrocks::pipeline::ScanOperator::_tr.
    @          0xc8e4ae2 std::_Function_handler<void (starrocks::workgroup::YieldContext&), starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}>::_M_invoke(std::_Any_data const&, starrocks::workgroup::YieldContext&)
    @          0xcb6c083 std::function<void (starrocks::workgroup::YieldContext&)>::operator()(starrocks::workgroup::YieldContext&) const
    @          0xcb6b56f starrocks::workgroup::ScanTask::run()
    @          0xcbe1b7f starrocks::workgroup::ScanExecutor::worker_thread()
    @          0xcbe182d starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}::operator()() const
    @          0xcbe2be6 void std::__invoke_impl<void, starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}&>(std::__invoke_other, starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}&)
    @          0xcbe28d2 std::enable_if<is_invocable_r_v<void, starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}&>, void>::type std::__invoke_r<void, starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}&>(starrocks::workgroup::ScanExecutor::initialize(.
    @          0xcbe2578 std::_Function_handler<void (), starrocks::workgroup::ScanExecutor::initialize(int)::{lambda()#1}>::_M_invoke(std::_Any_data const&)
    @          0xb0e9a90 std::function<void ()>::operator()() const
    @          0xb76c6ba starrocks::FunctionRunnable::run()
    @          0xb76afb3 starrocks::ThreadPool::dispatch_thread()
    @          0xb779724 void std::__invoke_impl<void, void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>(std::__invoke_memfun_deref, void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&)
    @          0xb778bf1 std::__invoke_result<void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>::type std::__invoke<void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&>(void (starrocks::ThreadPool::*&)(), starrocks::ThreadPool*&)
    @          0xb777f55 void std::_Bind<void (starrocks::ThreadPool::*(starrocks::ThreadPool*))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>)
Segmentation fault (core dumped)

@Samrose-Ahmed force-pushed the iceberg-schema-evol-pad branch 2 times, most recently from 9245031 to 76ca9a0, on July 31, 2024 at 07:54
@zombee0 (Contributor) commented Aug 5, 2024

@Samrose-Ahmed I think the root cause is in IcebergMetaHelper::prepare_read_columns: we validate against the Iceberg schema, but we should validate against the materialized_columns schema.

if (!_is_valid_type(parquet_field, iceberg_it->second)) {
    continue;
}

@Smith-Cruise (Contributor)

IcebergMetaHelper::_is_valid_type() should check based on both TIcebergSchemaField and TypeDescriptor.

You can take a look at bool ParquetMetaHelper::_is_valid_type(); it checks by TypeDescriptor.
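
For a concrete picture of the dual check being suggested, here is a hedged, self-contained sketch; the struct definitions are simplified stand-ins for ParquetField, TIcebergSchemaField, and TypeDescriptor, and the compatibility rules are illustrative only:

#include <string>

// Simplified stand-ins; the real ParquetField / TIcebergSchemaField /
// TypeDescriptor carry much more information.
struct ParquetFieldInfo { int field_id; std::string physical_type; };
struct IcebergFieldInfo { int field_id; };
struct SlotTypeInfo { std::string logical_type; };

// A field is only readable from the file if it matches the Iceberg schema
// field (structural check, by field id) AND is compatible with the requested
// slot type (type check, as ParquetMetaHelper::_is_valid_type does by
// TypeDescriptor).
bool is_valid_type(const ParquetFieldInfo& pf, const IcebergFieldInfo& iceberg,
                   const SlotTypeInfo& slot) {
    if (pf.field_id != iceberg.field_id) {
        return false;  // not the same column after schema evolution
    }
    // Naive physical/logical compatibility rule, purely for illustration.
    if (slot.logical_type == "string") {
        return pf.physical_type == "BYTE_ARRAY";
    }
    return pf.physical_type == slot.logical_type;
}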

@Samrose-Ahmed (Contributor Author)

That actually makes sense; I'll update the PR.

@Samrose-Ahmed (Contributor Author)

I've updated the PR.

@Samrose-Ahmed force-pushed the iceberg-schema-evol-pad branch 2 times, most recently from cbb02fd to a7d3b29, on August 5, 2024 at 05:58
@zombee0 (Contributor) left a comment

LGTM. For clang-format, could you run ./clang-format.sh in /path/to/starrocks/code/build-support?

@@ -274,6 +274,10 @@ Status GroupReader::_create_column_reader(const GroupReaderParam::Column& column)
        RETURN_IF_ERROR(ColumnReader::create(_column_reader_opts, schema_node, column.slot_type(),
                                             column.t_iceberg_schema_field, &column_reader));
    }
    if (column_reader == nullptr) {
        // this shouldn't happen but guard
        return Status::InternalError("No valid column reader.");
Contributor

good job

@Samrose-Ahmed force-pushed the iceberg-schema-evol-pad branch 2 times, most recently from 43eb6fe to 2ec8151, on August 6, 2024 at 03:01
@Samrose-Ahmed (Contributor Author)

This approach doesn't seem to completely work.

I updated it to use field ids, but the new test (TestStructEvolutionPadNull) now fails with the same original issue: it returns 'No valid column reader' (before the check, it would crash).

The suggested change is fine for making the type check robust, but it doesn't address the root issue, which is that the nested subfield needs to be padded with a default value, and that doesn't happen with this code. As the code comment mentions, we put a nullptr column reader in children_reader and expect to append a default value for this subfield later, but it is never appended.

It seems like we do need the change I made earlier. Thoughts? @Smith-Cruise @zombee0
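
To make the missing padding step concrete, here is a hedged sketch of what appending NULLs for a subfield whose child reader is nullptr could look like; the types are hypothetical, not the actual StarRocks reader classes:

#include <cstddef>
#include <memory>
#include <vector>

// Illustrative column: a null mask is enough to show the padding.
struct NullableColumn {
    std::vector<bool> null_mask;  // true == NULL
    void append_nulls(std::size_t n) { null_mask.insert(null_mask.end(), n, true); }
};

struct ChildReader {
    virtual ~ChildReader() = default;
    virtual void read(std::size_t n, NullableColumn* out) = 0;
};

// Read one batch of a struct column. A child with no reader corresponds to a
// subfield missing from the old file: pad it with NULLs (the Iceberg rule)
// instead of dereferencing the null reader.
void read_struct_children(const std::vector<std::unique_ptr<ChildReader>>& children,
                          std::vector<NullableColumn>& columns, std::size_t num_rows) {
    for (std::size_t i = 0; i < children.size(); ++i) {
        if (children[i] == nullptr) {
            columns[i].append_nulls(num_rows);
        } else {
            children[i]->read(num_rows, &columns[i]);
        }
    }
}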

PR StarRocks#48151 introduced a regression, where what would previously return an
Error now caused a nullptr dereference and crashed the entire CN.
This change handles that case and returns nulls instead.

Signed-off-by: Samrose Ahmed <[email protected]>
@Smith-Cruise (Contributor)

You need to backport to branch-3.2 as well; please check it.

github-actions bot added the 3.2 label on Aug 7, 2024
github-actions bot commented Aug 7, 2024

[FE Incremental Coverage Report]

pass : 0 / 0 (0%)

github-actions bot commented Aug 7, 2024

[BE Incremental Coverage Report]

pass : 27 / 28 (96.43%)

file detail

path                                        covered_line  new_line  coverage  not_covered_line_detail
🔵 be/src/formats/parquet/group_reader.cpp  1             2         50.00%    [306]
🔵 be/src/formats/parquet/meta_helper.cpp   26            26        100.00%   []

@packy92 merged commit e224d5d into StarRocks:main on Aug 7, 2024
66 of 68 checks passed
github-actions bot commented Aug 7, 2024

@Mergifyio backport branch-3.3

github-actions bot removed the 3.3 label on Aug 7, 2024
github-actions bot commented Aug 7, 2024

@Mergifyio backport branch-3.2

github-actions bot removed the 3.2 label on Aug 7, 2024
mergify bot commented Aug 7, 2024

backport branch-3.3

✅ Backports have been created

mergify bot commented Aug 7, 2024

backport branch-3.2

✅ Backports have been created
